Add warning about restart migration #116769
We have gotten more than one SDH due to customers not understanding why restarts involving fully-mounted indices can pull a lot of data from the snapshot tier, so it may help to be more explicit about why this happens and how it can be avoided.
Pinging @elastic/es-docs (Team:Docs)
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)
LGTM (with two minor edits).
It's worth noting that if a searchable snapshot index has no replicas (as is the default
in the cold tier), then when the node hosting it is shut down, allocation will immediately
I'd prefer to avoid mentioning the default in this section; we discuss replicas just above it.
Suggested change:
- It's worth noting that if a searchable snapshot index has no replicas (as is the default
- in the cold tier), then when the node hosting it is shut down, allocation will immediately
+ It's worth noting that if a searchable snapshot index has no replicas, then when the node hosting it is shut down, allocation will immediately
👍🏻
in the cold tier), then when the node hosting it is shut down, allocation will immediately
try to relocate the index to a new node in order to maximize availability. For fully mounted
indices this will result in the new node downloading the entire index snapshot from
the cloud repository, which might be expensive especially during rolling restarts. Temporarily
I think we generally expect those costs to be very low, with real cost implications only in exceptional cases, so I'd like to remove the cost mention:
Suggested change:
- the cloud repository, which might be expensive especially during rolling restarts. Temporarily
+ the cloud repository. Temporarily
👍🏻
LGTM with a comment suggestion.
The other place I was thinking could use additional documentation would be the rolling restart docs -- or maybe in the allocation settings under `allocation.enable`, `primaries`. We could mention that searchable snapshots without replicas will specially not be reallocated when `primaries` is set. This doesn't need to hold up this PR, though, just an idea for an additional improvement if you feel like it.
multiple clusters and use <<modules-cross-cluster-search,{ccs}>> or
<<xpack-ccr,{ccr}>> instead of {search-snaps}.

It's worth noting that if a searchable snapshot index has no replicas, then when the node
I'd delete "It's worth noting that" and start the sentence as is without it -- you're writing it, so it's obviously worth noting :)
Should this be a WARNING block as well, like directly above? Seems a little strange to have text -> warning -> text again; and this is sort of a warning.
I merged just before your comment, apologies!
No worries, looks like a near perfect race condition 😄
💚 Backport successful